Condition for perfect dimensionality recovery by variational Bayesian PCA
نویسندگان
چکیده
Having shown its good performance in many applications, variational Bayesian (VB) learning is known to be one of the best tractable approximations to Bayesian learning. However, its performance was not well understood theoretically. In this paper, we clarify the behavior of VB learning in probabilistic PCA (or fully-observed matrix factorization). More specifically, we establish a necessary and sufficient condition for perfect dimensionality (or rank) recovery in the large-scale limit when the matrix size goes to infinity. Our result theoretically guarantees the performance of VB-PCA. At the same time, it also reveals the conservative nature of VB learning—it offers a low false positive rate at the expense of low sensitivity. By contrasting with an alternative dimensionality selection method, we characterize VB learning in PCA. In our analysis, we obtain bounds of the noise variance estimator, and a new and simple analytic-form solution for the other parameters, which themselves are useful for implementation of VB-PCA.
منابع مشابه
Perfect Dimensionality Recovery by Variational Bayesian PCA
The variational Bayesian (VB) approach is one of the best tractable approximations to the Bayesian estimation, and it was demonstrated to perform well in many applications. However, its good performance was not fully understood theoretically. For example, VB sometimes produces a sparse solution, which is regarded as a practical advantage of VB, but such sparsity is hardly observed in the rigoro...
متن کاملSparse Probabilistic Principal Component Analysis
Principal component analysis (PCA) is a popular dimensionality reduction algorithm. However, it is not easy to interpret which of the original features are important based on the principal components. Recent methods improve interpretability by sparsifying PCA through adding an L1 regularizer. In this paper, we introduce a probabilistic formulation for sparse PCA. By presenting sparse PCA as a p...
متن کاملOn Bayesian PCA: Automatic Dimensionality Selection and Analytic Solution
In probabilistic PCA, the fully Bayesian estimation is computationally intractable. To cope with this problem, two types of approximation schemes were introduced: the partially Bayesian PCA (PB-PCA) where only the latent variables are integrated out, and the variational Bayesian PCA (VB-PCA) where the loading vectors are also integrated out. The VB-PCA was proposed as an improved variant of PB-...
متن کاملExtensions of probabilistic PCA
Principal component analysis (PCA) is a classical data analysis technique. Some algorithms for PCA scale better than others to problems with high dimensionality. They also differ in the ability to handle missing values in the data. In our recent paper [1], a case is studied where the data are high-dimensional and a majority of the values are missing. In the case of very sparse data, overfitting...
متن کاملVariational Principal Components
One of the central issues in the use of principal component analysis (PCA) for data modelling is that of choosing the appropriate number of retained components. This problem was recently addressed through the formulation of a Bayesian treatment of PCA (Bishop, 1999a) in terms of a probabilistic latent variable model. A central feature of this approach is that the effective dimensionality of the...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Journal of Machine Learning Research
دوره 16 شماره
صفحات -
تاریخ انتشار 2015